Method and apparatus for performing memory protection operations in a parallel processor system.

In a multiprocessor system, memory accesses by the individual processing elements are checked by a
common controller (320). The controller includes a table of values defining valid memory locations for a task.
The controller verifies the address value used by each instruction to ensure that it is within a valid memory area
for the particular task. Additional circuitry for the controller and processing elements allows finer control of
memory accessibility. The multiprocessor system may be coupled to a host computer through a buffer (308).
Data is serially written into the buffer (308) by the host (302) and is read out of the buffer (308) in parallel by the 
multiprocessor system. The buffer used in this system includes apparatus which calculates an error correction 
code from a serial data stream and passes this code, along with the data, to the multiprocessor system. The 
multiprocessor system includes apparatus (310) which processes the data in parallel to handle errors occurring 
during transfers as indicated by the code.

METHOD AND APPARATUS FOR PERFORMING MEMORY PROTECTION OPERATIONS IN A PARALLEL PROCESSOR SYSTEM



Field of the Invention 

The present invention relates generally to multiprocessor parallel computing systems and particularly to 
methods and apparatus for maintaining the integrity of data processed by such systems. 

Background of the Invention 

Multiprocessor parallel computing systems have recently become available which may be coupled to a 
host computer to enhance its performance. Generally, an attached parallel processing system of this type
has a relatively limited instruction set. It is designed to perform simple, repetitive operations in parallel and
thus reduce the elapsed time for processing a program. A system of this type is generally coupled to a
communications bus of the host computer and is treated as an input/output (I/O) device. 

The most common types of attached multiprocessor systems are the Multiple-Instruction Multiple-Data
(MIMD) systems and the Single-Instruction Multiple-Data (SIMD) systems. An MIMD system is a
conventional multiprocessor system where each processor may execute a separate program operating on a
separate data set. The processors in a system of this type may perform separate tasks or they may each
perform a different sub-task of a common main task.

In an SIMD system, each processor may have a different set of data in its associated memory, but all
processors are governed by a common controller and perform the same operations on each of the different
data sets. Processors of this type may be used, for example, for simulation programs in which the effects of
a stimulus on a set of points spanning an area or a volume are calculated simultaneously.

When either of these two types of systems is coupled to a host computer, instructions and data are 
transferred between the multiprocessor system and the host computer via a communications bus. 

Many computer systems include apparatus which continually checks the validity of the data being
processed. This apparatus ranges from parity checking circuitry to circuitry which inserts and analyzes error
correcting codes (ECCs). Although apparatus of this type may be used to maintain data integrity separately
in the host computer and in the multiprocessor system, it may be difficult to verify the integrity of data
transferred between the two systems.

To illustrate how these problems may occur, consider an exemplary multiprocessor system, the
Polymorphic-Torus network, which is described in a paper by H. Li et al. entitled "Polymorphic-Torus
Network", Proc. Int. Conference on Parallel Processing, pp 411-414, 1987. This system is an SIMD
processor network in which N² bit-serial processors are arranged in an NxN matrix. Assuming the host
computer uses K-bit words in its data processing, data values are transferred to the multiprocessor system
in groups of N² K-bit words. In a typical application, these data values may be stored into a buffer as N²
K-bit words and may be shifted out of the buffer into the N² bit-serial processors as K N²-bit words. Any ECC
incorporated in the K-bit words generated by the host would be difficult to use in the attached
multiprocessors. Similarly, any ECC developed by the multiprocessors would be difficult to use in the host processor.

An SIMD multiprocessor system may be used in a multiprogramming environment, that is to say, the
system may run multiple programs on a time-slice basis. For example, when a program running on the
multiprocessor system enters a wait state, e.g. to perform an I/O operation, another program may be
activated to run on the system. When this second program enters a wait state, the first program is
reactivated. Operating the system in this manner is generally more efficient than restricting it to execute
each program to completion before starting the next program. However, there is a potential for data
corruption if one program is allowed access to data locations used by another program while the other
program is inactive.

U.S. Patent No. 4,773,038 to Hillis et al. relates to an SIMD system in which each memory associated
with one of the processing elements may be subdivided. Each processing element operates on the contents 
of each subdivision sequentially to simulate a greater number of processors. 

U.S. Patent No. 4,727,474 to Batcher relates to a multiprocessor system which has a staging memory
system that includes error detection and correction apparatus for data used by the multiprocessor system.

U.S. Patent No. 4,636,942 to Chen et al. relates to a computer system that has multiple independent 
processors. The system includes a set of shared registers which are used to coordinate access to 
resources that are common to all of the processors. 

U.S. Patent No. 4,569,052 to Cohn et al. relates to apparatus for protecting computer memory which
uses a parity matrix to generate an error correcting code.

U.S. Patent No. 4,523,273 to Adams, III et al. relates to a multistage data routing system which includes
error correction and error detection apparatus.

U.S. Patent No. 4,299,790 to Gilliand et al. relates to an MIMD system which includes apparatus for
checking memory accesses against base and length parameters for a task. If an attempted access is found
to be out of range, the task is suspended. Data transfers between asynchronous tasks are facilitated by
semaphores implemented in hardware.

U.S. Patent No. 4,101,960 to Stokes et al. relates to an SIMD computer system which includes
apparatus that contains bounds and descriptions of vectors defined in a memory space. Memory access 
errors may be checked by this apparatus to provide early detection of errors in vector processing. 

Summary of the Invention 

The present invention is embodied in a multiprocessor system in which memory accesses by individual 
processing elements are checked by a common controller. The controller includes a table of values defining 
valid memory locations for a task. The controller verifies the address value used by each instruction to 
ensure that it is within a valid memory area of the task. 

In another aspect of the invention, the multiprocessor system is coupled to a host computer through a
buffer. Data are serially written into the buffer by the host and are read out of the buffer in parallel by the
multiprocessor system. The buffer used in this system includes apparatus which calculates an error
correction code (ECC) from a serial data stream and passes this code, along with the data, to the
multiprocessor system. The multiprocessor system includes apparatus which processes the data in parallel
to handle errors indicated by the ECC.

Brief Description of the Drawings

Fig. 1 (Prior art) is a block diagram of a computer system which includes a parallel processor.

Fig. 2 (Prior art) is a block diagram of a parallel processor suitable for use in the computer system
shown in Fig. 1.

Fig. 2A is a diagram that is useful for explaining the structure of the parallel processor system shown in
Fig. 2.

Fig. 3 is a data structure diagram which illustrates an internal table maintained by a parallel
processor controller which includes an embodiment of the present invention.

Fig. 4 is a block diagram which is useful for describing the interface between the processing elements
and the controller of a parallel processor system in accordance with the present invention.

Fig. 5 is a block diagram of a portion of an exemplary controller for a parallel processor system which
includes an embodiment of the present invention.

Fig. 6 is a block diagram which illustrates an extension of the controller apparatus shown in Fig. 5.

Fig. 7 is a block diagram of an error correction system suitable for use in a parallel processor system
which includes an embodiment of the present invention.

Fig. 8 is a block diagram of error checking and correcting apparatus suitable for use in the portion of the
parallel processor system shown in Fig. 7.

Fig. 9 is a connection diagram for a rule plane which corresponds to the exemplary algorithm presented in
the text for generating an error correction code.

Fig. 10 is a connection diagram for a rule plane which corresponds to the exemplary algorithm presented
in the text for using an error correction code for purposes of error detection.

Detailed Description

Fig. 1 is a block diagram of a computer system which includes an attached multiprocessor system. The
computer system includes a host processor 110 which is coupled to a main memory 112 via a memory bus
MB. The processor 110 is further coupled to peripheral devices via an I/O bus 120. The peripheral devices
may include, for example, a mass storage device 114, such as a disk drive, and an operator display
terminal which may include a cathode ray tube (CRT) display device and a keyboard input device 116. In
this embodiment of the invention, a parallel processor 118 is coupled to the I/O bus 120 as a peripheral
device.

Fig. 2 is a block diagram of an exemplary parallel processor 118. The processor shown in Fig. 2
includes a controller 210 which is coupled to the I/O bus 120 to receive commands from the host processor
110. These commands determine the processing steps performed by an N by N processor array 212 on
data values stored in a memory 214. Data transfers between the host processor 110 and the memory 214
are accomplished using a transfer controller 218 and an N² by K-bit buffer 216.

In normal operation, the host processor 110 supplies data for each of the N² processors 212 to the bus
120 as blocks of N² K-bit words. Each block is loaded into the buffer 216 by the transfer controller 218
which is controlled by a program provided by the host processor 110 via the bus 120. As each block is
loaded, it is transferred into the memory 214 under control of the transfer controller 218. When the data
values have been provided, the host processor 110 loads a program for the parallel processors 212 into the
controller 210. The controller 210 then sends a copy of each program instruction encountered during
program sequencing to each processing element within the N X N Processor Array 212.

Fig. 2A is a representation of an exemplary N X N Processor Array 212 with its associated memory
214. While this figure shows a representation of specific dimensions, it is contemplated that the actual size
of the structure can be increased or decreased as necessary to suit individual applications. The blocks
shown in the foreground plane of this figure represent individual processing elements 212. The remaining
blocks shown in this figure represent individual memory elements 214. In the system shown in Figs. 2 and 2A,
a processing element is capable of directly accessing any memory element directly behind itself.

In this system, a first processing element is also capable of accessing data in a memory location
behind a second processing element. However, such access is only possible indirectly. For a first processor
to retrieve data from a memory element behind a second processor, the second processor retrieves the
data and transfers it to the first processor. For the first processor to store data in a memory element behind
the second processor, the first processor transfers the data to the second processor and the second
processor performs the actual storage operation.

Each N X N plane of memory parallel to the processor plane is referred to as a memory plane. Access
to a specific memory element is a function of the memory plane in which the memory element resides and
the processing element which is directly in front of the memory element and, hence, can access it directly.
Thus, a memory location is specified by the combination of a processing element number, a memory plane
number and, optionally, an address within a set of memory elements having a common processing element
number and memory plane number. In the embodiments of the invention described below, the processing
elements are bit-serial devices and the memory planes are bit-planes. Consequently, each bit in the
memory 214 may be uniquely identified by a processing element number and a bit-plane number.
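
The addressing and indirect-access rules described above can be illustrated with a small software model. This is only a sketch: the array size N, the number of bit-planes PLANES and all function names are hypothetical and are not taken from the specification.

```python
# Hypothetical model of the memory 214; sizes and names are illustrative only.
N = 4        # the array holds N x N processing elements
PLANES = 8   # number of bit-planes behind each processing element

# memory[pe][plane] holds a single bit, addressed by processing element
# number and bit-plane number.
memory = [[0] * PLANES for _ in range(N * N)]


def direct_read(pe, plane):
    """A processing element reads a bit from the column directly behind it."""
    return memory[pe][plane]


def direct_write(pe, plane, bit):
    """A processing element writes a bit into the column directly behind it."""
    memory[pe][plane] = bit & 1


def indirect_read(requester, owner, plane):
    """The owner retrieves the bit and transfers it to the requester."""
    bit = direct_read(owner, plane)  # only the owner touches its own column
    return bit                       # value handed over to `requester`


def indirect_write(requester, owner, plane, bit):
    """The requester hands the bit to the owner, which performs the store."""
    direct_write(owner, plane, bit)


direct_write(5, 3, 1)
print(indirect_read(requester=0, owner=5, plane=3))
```

In this model no processing element ever indexes another element's column; every remote access is routed through the owning element, as in the text.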

All processors in the N X N Processor Array execute the same instruction simultaneously. However,
each processor manipulates data within a separate memory partition. Thus, the data manipulated by the
various processors may be different.

In this embodiment of the invention, a first, currently-running process may enter a waiting state or be
preempted by a second process at any time. If a second set of instructions, relating to the second process, is
sent before a first process has completed execution in the N X N Processor Array, it is desirable to
preserve a part of the current state of the computer so that execution of the first process can resume at a
later time.

It is possible that the instructions for the second process may write information into a memory location
that is still being used by the first process. In this way, data which may be used by the first process can 
become corrupted. 

This invention acts to restrict the memory locations that may be accessed by the second process or
any other process. In this way, the corruption of data belonging to the first process may be avoided.

In addition, the present invention verifies that an address is valid for an instruction in a particular
process before the instruction is executed. If the address is invalid, then the current process is suspended
and an interrupt is sent to an operating system.

Fig. 4 shows exemplary circuits which determine the validity of an instruction/address pair. In Fig. 4, a
single processing element 830 is shown. In a preferred embodiment of the disclosed invention, many
processing elements 830 will exist. Each processing element 830 is coupled to an associated processing
element memory array 840. The processing element accesses data from its associated processing element
memory array via a switch 809.

The instruction decoder 802 determines whether the instruction/address pair which has just been read
from program memory 803 will read or write memory 840, associated with the processing element 830. This
information is then sent to a permission table 805.

The permission table 805 determines whether a read or a write access is allowed at a specified
address. There are several schemes that can be used to make this determination. These schemes may
include:

registers indicating upper and lower limits of a contiguous storage area;
registers containing the size of a memory area;
bits that indicate whether a particular section of memory may be accessed;
identification keys associated with particular sections of memory;
one table of access rights for each process, each table being loaded into special registers immediately
before the process starts running.

In Fig. 3, an exemplary permission table is shown. The table is organized such that the rows correspond
to respectively different sections of memory and the columns correspond to respectively different parameters
of a process. In the exemplary embodiments of the invention, each row may be assigned to a respectively
different process or multiple rows may be grouped and the groups assigned to respectively different
processes. It is understood that any scheme for determining the allowability of a memory access may be
used. In Figs. 3, 5 and 6, the exemplary permission tables include R + 1 rows, corresponding to R + 1
sections of memory. In general, there may be any number of memory sections, although any number
beyond the total number of memory planes is redundant.

In the exemplary permission table, five columns are shown: LB, UB, RB, WB and PB. The LB column
indicates a lower boundary of memory which can be accessed in the section of memory associated
with the row. The UB column indicates an upper boundary of memory that can be accessed in the section
of memory associated with the row. The RB and WB columns, respectively, indicate read and write
protection for the section of memory specified by the LB and UB entries.
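
A permission table with the LB, UB, RB and WB columns can be sketched in software as follows. The class and function names are hypothetical, and the treatment of the lower bound as inclusive and the upper bound as exclusive is an assumption taken from the comparator description given for Fig. 5.

```python
from dataclasses import dataclass


@dataclass
class PermissionRow:
    """One row of a hypothetical permission table (field names are illustrative)."""
    lb: int    # lower boundary bit-plane, assumed inclusive
    ub: int    # upper boundary bit-plane, assumed exclusive
    rb: bool   # True if reads are permitted in [lb, ub)
    wb: bool   # True if writes are permitted in [lb, ub)


def access_allowed(table, plane, is_write):
    """Return True if any row covering `plane` grants the requested access."""
    return any(
        row.lb <= plane < row.ub and (row.wb if is_write else row.rb)
        for row in table
    )


# Example table for one task: planes 0..7 are read-only, planes 8..15 read/write.
table = [
    PermissionRow(lb=0, ub=8, rb=True, wb=False),
    PermissionRow(lb=8, ub=16, rb=True, wb=True),
]

print(access_allowed(table, plane=3, is_write=False))   # read inside row 0
print(access_allowed(table, plane=3, is_write=True))    # write to read-only area
print(access_allowed(table, plane=20, is_write=False))  # outside every row
```

An access that no row grants would, per the text, cause the current process to be suspended and an interrupt to be raised; that reaction is left out of the sketch.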

The PB column is optionally used to restrict access to finer areas of memory. This column refers to a
plane of memory called the permission bit plane. Before a memory access may occur at a memory
location denoted by processing element address and bit plane address, a value which has been placed in a
corresponding location of the permission bit plane is evaluated. This corresponding location is called a
permission bit plane location. If the permission bit plane location corresponding to a particular processing
element contains a predetermined value, then that processing element will be allowed to access the
memory location specified by the instruction subject to the read and write constraints of the RB and WB
columns. As a special case, the permission bit planes may be ignored if the permission bit plane address,
in the PB column of the permission table 805, is set to a predetermined constant.
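
The per-element check against a permission bit plane can be sketched as below. The names are hypothetical, and two values are assumptions made only for illustration: a stored 1 is taken as the predetermined value that grants access, and PB address 0 is taken as the constant that disables the check.

```python
NO_PB_CHECK = 0   # assumed constant meaning "ignore the permission bit plane"
GRANT = 1         # assumed permission bit value that allows access


def enabled_elements(memory, pb_address, num_pe):
    """Return one flag per processing element: True if it may perform the access.

    memory[pe][plane] is the bit stored behind processing element `pe`
    in bit-plane `plane`.
    """
    if pb_address == NO_PB_CHECK:
        return [True] * num_pe            # finer-grained check is skipped
    return [memory[pe][pb_address] == GRANT for pe in range(num_pe)]


# Four processing elements, three bit-planes each; plane 2 holds permission bits.
memory = [
    [0, 1, 1],   # PE 0: permitted
    [0, 0, 0],   # PE 1: inhibited
    [0, 1, 1],   # PE 2: permitted
    [0, 0, 1],   # PE 3: permitted
]

print(enabled_elements(memory, pb_address=2, num_pe=4))
print(enabled_elements(memory, pb_address=NO_PB_CHECK, num_pe=4))
```

Any access that passes this mask would still be subject to the RB and WB constraints of the selected row.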

If more flexibility is desired in the allocation of memory, the permission table may include multiple rows
for each process. In this scheme, the permission table is reloaded before the start or resumption of each
process and the table only contains the entries for that process. The lower and upper bound registers, LB
and UB, and the read and write bits, RB and WB, are used to ascertain which row of the table is used to
determine whether a given memory access is allowed.

The contents of the permission table may be obtained from the host 110 or the data memory 801 of the
control unit 820.

The permission bit plane address is stored in permission table 805 as previously discussed. If a
permission bit plane is used, then the referenced permission bit plane address is transferred, via multiplexer
807, to a control and arithmetic unit (CAU) 806 located in the processing unit 830. The CAU 806 uses the
data stored in the local memory 808 at the permission bit plane address to determine whether access to a
specific memory location is allowed.

If, according to the permission table 805, a bit plane access is not allowed for a specific process, the
permission table 805 notifies the disable unit 804, which, in turn, signals the CAU 806 to inhibit this access.

In Fig. 5, circuitry representing a logical implementation of the permission table of Fig. 3 is shown. The
optional permission bit plane circuitry is not shown in this Figure. The circuit of Fig. 5 compares the
address of a memory access with addresses stored in the permission table. This circuit also determines
whether read and write operations are allowed at a specified address. When a memory access instruction is
transmitted from the instruction decoder 802 to the permission table 805, a comparator 504 determines
whether the location of this memory access is greater than or equal to a bit plane which is designated in
lower bound register 501. This register corresponds to the lower bound value LBR shown in Fig. 3.

The location of this memory access is also evaluated by a comparator 510. Comparator 510 determines
whether the location of the memory access is less than a bit plane which is designated in upper bound
register 507.

The output terminals of comparator 504 and comparator 510 are coupled to the input terminals of an
AND gate 513. If the referenced memory location is within the bounds designated by the lower bit plane
register 501 and the upper bit plane register 507, a logical true value will appear on the output terminal of
the AND gate 513. This signal is applied to respective input terminals of AND gates 515 and 521. If a
memory read operation is allowed as a result of the value stored in read bit register 516, a logical true value
will appear on the output terminal of AND gate 515, causing a logical false value to appear on the output
terminal of NOR gate 529. Similarly, if a memory write is allowed by virtue of values stored in write bit
register 522, a logical true value will appear on the output terminal of AND gate 521, causing a false logic
value to appear on the output terminal of NOR gate 532.

The signals IR and IW are provided by the instruction decoder. These signals indicate whether the
instruction reads or writes data, respectively. If an instruction reads data and gate 529 indicates that a read
is impermissible, a logical true value will appear on the output of gate 529. This value causes a logical true
value to appear on the output of AND gate 528. If an instruction writes data and the gate 532 indicates that
a write is impermissible, then a logical true value will appear on the output of gate 532. This
causes a logical true value to appear on the output of AND gate 531. The signals provided by the AND gates
528 and 531 are applied to the OR gate 533 and condition it to provide a logical true level at its output
terminal. This value is applied to the input terminal of disable unit 804 to block execution of the instruction
performing the impermissible memory access.
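
The gate network described for Fig. 5 can be traced in software. The following is a behavioral sketch only: the function name and the row representation are hypothetical, and the mapping of expressions to gate reference numerals follows the description above rather than the drawing itself.

```python
def disable_signal(rows, plane, ir, iw):
    """Model the output of OR gate 533 for one memory access.

    rows  : list of (lb, ub, rb, wb) tuples, one per permission table row
    plane : bit-plane address of the access
    ir/iw : True if the decoded instruction reads / writes memory
    """
    read_hits = []
    write_hits = []
    for lb, ub, rb, wb in rows:
        ge_lower = plane >= lb                # comparator 504
        lt_upper = plane < ub                 # comparator 510
        in_range = ge_lower and lt_upper      # AND gate 513
        read_hits.append(in_range and rb)     # AND gate 515
        write_hits.append(in_range and wb)    # AND gate 521

    read_denied = not any(read_hits)          # NOR gate 529
    write_denied = not any(write_hits)        # NOR gate 532

    bad_read = ir and read_denied             # AND gate 528
    bad_write = iw and write_denied           # AND gate 531
    return bad_read or bad_write              # OR gate 533 -> disable unit 804


rows = [(0, 8, True, False), (8, 16, True, True)]
print(disable_signal(rows, plane=3, ir=True, iw=False))   # permitted read
print(disable_signal(rows, plane=3, ir=False, iw=True))   # write to read-only rows
```

A True result corresponds to the logical true level that causes the disable unit 804 to block the instruction.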

Fig. 6 shows exemplary circuitry which implements a protection scheme that includes permission
bit-plane masking. By using permission bit-plane masking, instruction access may be restricted to finer areas
of memory (i.e., less than a bit plane). This is useful in systems where multiple users share an SIMD
computer memory. In addition, bits in a permission bit plane may be used to reserve selected areas of
memory as a system resource or to prevent access to defective memory cells. Permission bits can also be
used for debugging by showing where programs are reading or writing data at improper memory locations.

The logic circuitry shown in Fig. 6 which is associated with the lower bound registers, LB0 through LBR,
upper bound registers, UB0 through UBR, read bit-plane registers, RB0 through RBR, and write bit registers,
WB0 through WBR, is identical to the corresponding circuitry shown in Fig. 5. Three additional three-state
gates, 701, 703 and 705, three permission bit-plane registers 702, 704 and 706 and a permission bit plane
address bus 707 are added in Fig. 6.

In the example set forth above, the contents of the permission bit-plane register corresponding to the
selected contiguous storage area is detected by the circuitry in Fig. 5 that includes the lower and upper
bound registers and their associated comparators. This value is placed on the permission bit-plane address
bus 707. It is noted that the various address ranges specified by the lower and upper bit-plane registers are
desirably disjoint, or meaningless values may be placed on the permission bit-plane address bus 707.

In one embodiment of the disclosed invention, the value on the permission bit plane address bus 707 is
interpreted as the address of a bit-plane in the memory 840. This addressed bit-plane contains permission
bits that indicate which of the processing elements 830 are allowed to access their associated memory
arrays during the execution of the instruction. If the contents of the permission bit-plane for a particular
processing element is a first specified value (e.g. 0), then access to the corresponding memory location by
that processing element is inhibited. Otherwise, access is allowed. If subsequent decoding of the instruction
indicates that a memory operation is to take place, this condition is detected and an interrupt is forwarded
to the control unit to indicate an attempt was made to access an invalid memory location for the processing
element.

In an alternate embodiment of the invention, the scheme described above is extended so that, if a
particular address value (e.g. 0) appears on the permission bit-plane address bus 707, then the reading and 
checking of the permission bit-plane may be skipped for the instructions being executed. This may be 
desirable as a means to shorten the instruction cycle time in cases where the finer degrees of protection 
provided by the permission bit-plane scheme are not desired. 

In another alternative embodiment of the invention, the above schemes are modified so that the
permission bit plane resides in a memory bit-plane having an address that is either a fixed value or a value 
designated by a register in the control unit 820. In this instance it is only necessary that the bit-plane 
registers 702, 704 and 706 and the permission bit-plane address bus convey a single bit of information, 
indicating whether the permission bit-plane should be accessed. 

In yet another alternative embodiment, the first two schemes described above are modified so that the
permission bit plane occupies a set of registers 808, one set per processing element. In this instance, it
may be possible to read, test and act on the permission bits more quickly than if they are stored as a part
of the processing element memory array 840. For this embodiment of the invention, the permission
bit-plane address bus 707 conveys only a single bit of information: whether the permission bit-plane registers
808 are to be used for the instruction.

In still another alternative embodiment, the preceding scheme is extended to a set of multiple-bit 
registers 808 holding the contents of multiple permission bit planes. Permission bit plane address bus 707 
selects one of the permission bit planes. A particular address value (e.g. 0) is used as in other schemes 
above to disable checking. 

SIMD computer systems may be used in conjunction with Von Neumann type computers for
programming ease. Instructions are entered into the Von Neumann type computer (called the host) and are then
transferred to the SIMD computer, where they are executed by several processors simultaneously. An
example of this configuration is shown in Fig. 2.

The SIMD system described above may often participate in data transfers between itself and a host
computer as shown in Fig. 2. While an SIMD system and a host system may each have their own error
correction schemes, neither system can verify the integrity of data transferred between the two.

Fig. 7 is a block diagram of an error correction system suitable for use in the parallel processor system
shown in Fig. 2. This error correction system facilitates the transfer of data between the host 302 and the
memory cell array 312. The host 302 interfaces with a buffer 308 between itself and the SIMD system via
an I/O channel. The I/O channel is K bits wide. However, the SIMD system includes N² processors where N²
may not be equal to K. The buffer 308 is N² X K bits and is used as an intermediary form of storage
between the host 302 and the memory cell array 312. This N² by K-bit buffer 308 is accessed by the host
302 as a K-bit entity for each access. The K-bit entity is referred to as a word. The memory cell array 312
accesses the N² by K-bit buffer 308 as an N²-bit entity for each access. The N²-bit access entity
corresponds to a bit-plane.

The buffer 308 is organized as a two-dimensional structure because the memory organization for the
host system and the SIMD system are different. At the host end, data is organized as a K-bit word and N²
units of data are passed sequentially via the I/O channel. These N² data values, however, are distributed in
bit-serial form to N² SIMD processors and are organized in K consecutive memory bit locations in each
processor.

As a result of the two-dimensional organization of the buffer 308 used in the exemplary system, each
SIMD access to the buffer 308 involves one bit of each of the N² K-bit words. Conversely, each access by
the host 302 involves only one K-bit word.
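
The two access patterns of the buffer can be illustrated with a short sketch. The function names are hypothetical, and the choice of bit-plane 0 as the least significant bit of each word is an assumption; the specification does not fix a bit order.

```python
def to_bit_planes(words, k):
    """Convert N*N host words of K bits each into K bit-planes of N*N bits.

    Plane i holds bit i of every word (bit 0 assumed least significant).
    """
    return [[(word >> i) & 1 for word in words] for i in range(k)]


def to_words(planes):
    """Inverse transform: rebuild the host words from the bit-planes."""
    count = len(planes[0])
    return [
        sum(planes[i][j] << i for i in range(len(planes)))
        for j in range(count)
    ]


# Host side: four 4-bit words written one word per access.
words = [0b1010, 0b0110, 0b0001, 0b1111]

# SIMD side: each access returns one bit of every word.
planes = to_bit_planes(words, k=4)
print(planes[0])   # bit 0 of each word -> [0, 0, 1, 1]
print(to_words(planes) == words)
```

The sketch shows why a code computed over one K-bit word is of little use on the SIMD side: no single bit-plane access ever sees more than one bit of that word.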

Because the buffer 308 "corner turns" the transferred data, current error correction codes and schemes
are not applicable. Error correction codes in this case are not meaningful because only partial information
required by the error correcting code scheme is made available at each bit-plane access. For the same
reason, the error correcting code for the bit-plane is not meaningful because only one bit out of M bits of
the error correction information is available in any one access. Thus, the buffer which exists between the
host and the SIMD system is unprotected. Data may be unknowingly corrupted due to a failure of the buffer,
regardless of error protection at both the host and the SIMD unit.

In this embodiment of the invention, data integrity is preserved through the use of an error correction
code (ECC). Data transfers between the host 302 and the N² by K-bit buffer 308 occur through an ECC
generator and check circuit 306. Data transfers between the memory cell array 312 and the N² by K-bit buffer
308 occur through an ECC circuit 310. In actual practice, the ECC circuit 310 may reside with the memory
cell array 312 to form a memory board 330. A transfer controller 304 controls the operation of the ECC
generator and check circuit 306 and the ECC circuit 310.

The following definitions are useful for understanding the operation of the error correction code circuitry:
ECC(i) - One of i error correction code circuits.
C(i) - A bit i of an error correction code word.
B(W)(i) - An i-th bit of a word W received from the host 302 via the I/O channel.

By using a rule R, an ECC(i) can be generated for a B(W)(i) of W words. The rule R can be any error
correction code. An exemplary rule R for W words of length 16 bits follows. For this example, the W of
B(W)(i) is constant and is hence deleted; CX is the check bit. A rule R to generate the error correction code
bits which is well known in the field of error correction is a modified Hamming code and is as follows:
CX = B1 xor B2 xor B3 xor B5 xor B8 xor B9 xor B11 xor B14
C0 = B0 xor B1 xor B2 xor B4 xor B6 xor B8 xor B10 xor B12
C1 = B0 xor B3 xor B4 xor B7 xor B9 xor B10 xor B13 xor B15 xor 1
C2 = B0 xor B1 xor B5 xor B6 xor B7 xor B11 xor B12 xor B13 xor 1
C4 = B2 xor B3 xor B4 xor B5 xor B6 xor B7 xor B14 xor B15
C8 = B8 xor B9 xor B10 xor B11 xor B12 xor B13 xor B14 xor B15

The relation between the error correction code bits C and the data bits B is summarized in Table 1.
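
The exemplary rule R can be evaluated directly when all 16 data bits are available at once. The sketch below encodes the six equations as lists of data-bit indices plus a constant term; the names RULE and check_bits are hypothetical and are used only for illustration.

```python
# Each check bit: (indices of the data bits that are XORed, constant term).
RULE = {
    "CX": ([1, 2, 3, 5, 8, 9, 11, 14], 0),
    "C0": ([0, 1, 2, 4, 6, 8, 10, 12], 0),
    "C1": ([0, 3, 4, 7, 9, 10, 13, 15], 1),
    "C2": ([0, 1, 5, 6, 7, 11, 12, 13], 1),
    "C4": ([2, 3, 4, 5, 6, 7, 14, 15], 0),
    "C8": ([8, 9, 10, 11, 12, 13, 14, 15], 0),
}


def check_bits(data):
    """Compute the six check bits for a list of 16 data bits B0..B15."""
    result = {}
    for name, (taps, constant) in RULE.items():
        value = constant
        for index in taps:
            value ^= data[index]
        result[name] = value
    return result


clean = [0] * 16
print(check_bits(clean))

# Flipping a single data bit changes exactly the check bits whose
# equations contain that bit; for B5 these are CX, C2 and C4.
corrupted = list(clean)
corrupted[5] ^= 1
before, after = check_bits(clean), check_bits(corrupted)
print(sorted(name for name in RULE if before[name] != after[name]))
```

Comparing the stored check bits with freshly computed ones in this way is what allows a single-bit error to be located.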
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The Error Correction Code (ECC) for the buffer can be generated in two ways: one is related to writing 
the buffer from the host side, and the other is related to writing the buffer from the SIMD side. In both cases 
the same rule R is applied to generate the code bits. These generated code bits are stored in an extra 
memory dedicated to the ECC. For the data written by SIMD system, the rule R, to generate output code 
20 bits, can be implemented by a commercial ECC part (e.g. AMD296072960A manufactured by Advanced 
Micro Devices) because the N X N bits of input data are simultaneously available to the ECC generation 
circuitry. 

By contrast, the data used to generate the equivalent ECC for writing the buffer from the host side is 
available sequentially. As a result, the ECC code bits should be generated by evaluating the rule R 
sequentially. A new circuit is provided for the two-dimensional buffer protection. 

In Fig. 8, a circuit is shown for the generation and analysis of error correction codes. A demultiplexer 
404 (also represented as a state decoder) distributes incoming data from the host to a proper column of a 
rule plane 406. The rule plane 406 directs the data from the column to the appropriate row according to the 
algorithm set forth by rule R. In a preferred embodiment of the invention, the rule plane consists of a fixed 
pattern of interconnections which are used to connect selected rows to selected columns. An example of a 
rule plane which is used to implement the algorithm of Table 1 is shown in Fig. 9. EXCLUSIVE-OR gates 
408, 410 and 412 are used for performing the EXCLUSIVE-OR function in relationship to rule R. A state 
recorder 430 is used for calculating and maintaining intermediate data in relationship to rule R. The state 
recorder 430 will clock appropriate flip-flops 414, 416 and 418 in a pre-determined order to execute the 
algorithm specified by rule R. After the state recorder 430 has provided 16 clock signals, the flip-flops 414, 
416 and 418 contain the bits which constitute the error correction code. This error correction code is stored 
in a portion of memory adjacent to the N² by K-bit buffer 308. 
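
The sequential evaluation described above can be modelled in software as a set of one-bit accumulators, each updated by an EXCLUSIVE-OR as successive data bits arrive. The Python sketch below is a behavioural model under stated assumptions, not a description of the circuit of Fig. 8: the names TAPS, PRESET, rule_plane and serial_check_bits are hypothetical, bits are assumed to arrive B0 first, and the constant term of C1 and C2 is assumed to be realized by presetting the corresponding accumulator to 1.

```python
# Data-bit positions that feed each check bit, copied from rule R.
TAPS = {
    "CX": (1, 2, 3, 5, 8, 9, 11, 14),
    "C0": (0, 1, 2, 4, 6, 8, 10, 12),
    "C1": (0, 3, 4, 7, 9, 10, 13, 15),
    "C2": (0, 1, 5, 6, 7, 11, 12, 13),
    "C4": (2, 3, 4, 5, 6, 7, 14, 15),
    "C8": (8, 9, 10, 11, 12, 13, 14, 15),
}
# Assumption: the "xor 1" terms of rule R are modelled as accumulator presets.
PRESET = {"C1": 1, "C2": 1}


def rule_plane(column):
    """Return the rows (check bits) connected to a given data-bit column."""
    return [name for name, taps in TAPS.items() if column in taps]


def serial_check_bits(bits):
    """Accumulate check bits from 16 data bits presented one per clock, B0 first."""
    acc = {name: PRESET.get(name, 0) for name in TAPS}
    for column, bit in enumerate(bits):
        for row in rule_plane(column):
            acc[row] ^= bit  # XOR output stored back into the accumulator
    return acc
```

With the taps of rule R, every data-bit column connects to exactly three rows, so each clock period updates three accumulators in this model.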

The ECC checking is the counterpart of the ECC generation. ECC checking is performed when (1) 
reading the buffer from the SIMD side, and (2) reading the buffer from the host side. A read operation from 
the SIMD system will read out both the data bits and the code bits generated according to R. Since the 
data and the code bits are available simultaneously at the input port to the SIMD system, this ECC 
checking can be implemented by using a commercial ECC part (e.g. AMD2960/AMD2960A). 

Error correction code checking at the host side may be performed by using a circuit which is similar to 
that illustrated in Fig. 8. However, the rule plane of Fig. 10 is substituted for the rule plane of Fig. 9. Error 
correction code checking is accomplished by using a set of six syndrome bits (SX, S0, S1, S2, S4, S8). 
Syndrome bits may be generated in the state recorder of Fig. 8 by using the rule R' set forth below. An 
example for one of W words of length 16 bits follows: 

SX = B1 xor B2 xor B3 xor B5 xor B8 xor B9 xor B11 xor B14 xor CX 
S0 = B0 xor B1 xor B2 xor B4 xor B6 xor B8 xor B10 xor B12 xor C0 
S1 = B0 xor B3 xor B4 xor B7 xor B9 xor B10 xor B13 xor B15 xor C1 
S2 = B0 xor B1 xor B5 xor B6 xor B7 xor B11 xor B12 xor B13 xor C2 
S4 = B2 xor B3 xor B4 xor B5 xor B6 xor B7 xor B14 xor B15 xor C4 
S8 = B8 xor B9 xor B10 xor B11 xor B12 xor B13 xor B14 xor B15 xor C8 

Syndrome bit generation differs from error correction code bit generation in that syndrome bits are a 
function of the error correction code bits. 

By using syndrome bits in conjunction with Table 2, it is possible to determine not only whether a 
single-bit error has occurred, but also the bit location of this error and whether certain multiple-bit errors have 
occurred. 
* -> no errors detected 
Number -> location of the single bit-in-error 
T -> two errors detected 
M -> three or more errors detected 
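
The use of syndrome bits to classify errors can be sketched in Python as follows. This is a hedged illustration, not a reproduction of Table 2: the classification is derived from the tap patterns of rule R rather than copied from the table, and the names TAPS, CONST, check_bits, syndrome and classify are hypothetical. The sketch assumes that B0 is the least significant bit of the word and that the syndrome is formed by recomputing the check bits, including the constant term of C1 and C2, and EXCLUSIVE-ORing them with the stored check bits, so that an error-free word yields an all-zero syndrome.

```python
# Data-bit positions that feed each check bit, copied from rule R.
TAPS = {
    "CX": (1, 2, 3, 5, 8, 9, 11, 14),
    "C0": (0, 1, 2, 4, 6, 8, 10, 12),
    "C1": (0, 3, 4, 7, 9, 10, 13, 15),
    "C2": (0, 1, 5, 6, 7, 11, 12, 13),
    "C4": (2, 3, 4, 5, 6, 7, 14, 15),
    "C8": (8, 9, 10, 11, 12, 13, 14, 15),
}
CONST = {"C1": 1, "C2": 1}  # the "xor 1" terms of rule R
NAMES = list(TAPS)          # syndrome order: SX, S0, S1, S2, S4, S8


def check_bits(word):
    """Recompute the check bits of a 16-bit word (B0 = least significant bit)."""
    out = {}
    for name, taps in TAPS.items():
        bit = CONST.get(name, 0)
        for i in taps:
            bit ^= (word >> i) & 1
        out[name] = bit
    return out


def syndrome(word, stored):
    """Assumed syndrome: recomputed check bits XOR stored check bits."""
    fresh = check_bits(word)
    return tuple(fresh[n] ^ stored[n] for n in NAMES)


def classify(syn):
    """Map a syndrome to '*', a data-bit index, a check-bit name, 'T' or 'M'."""
    if not any(syn):
        return "*"                      # no errors detected
    if sum(syn) % 2 == 0:
        return "T"                      # even weight: two errors detected
    for i in range(16):
        if syn == tuple(int(i in TAPS[n]) for n in NAMES):
            return i                    # single data bit in error
    if sum(syn) == 1:
        return NAMES[syn.index(1)]      # single check bit in error
    return "M"                          # multiple errors that match no single-bit pattern
```

For example, with stored = check_bits(0xA5C3), flipping data bit 9 of the word gives a syndrome that classify maps to 9, while flipping two data bits gives "T". Three or more errors may alias onto a single-bit pattern, which is a known limit of codes of this kind.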



Claims 



1. A method of restricting a memory access by a computer program instruction in a Single Instruction 
Multiple Data (SIMD) computer which includes a plurality of processing elements wherein each processing 
element is associated with a respective partition of a memory, and further including a table containing 
memory addresses which indicate allowed memory access, said method comprising the steps of: 

a) analyzing the computer program instruction to determine whether the instruction will access the 
memory associated with the SIMD computer and the address in the memory which it will use; 

b) evaluating the table to determine if memory access is allowed at the determined address location for 
any of the plurality of processors; 

c) selectively inhibiting the execution of the instruction based upon the determination of step b). 

2. A method of restricting a memory access by a computer program instruction in a SIMD computer 
including a memory, a table containing address values for data in the memory, and respective permission 
read bits and permission write bits which indicate, for the respective address values, whether read 
operations and write operations, respectively, are allowed, said method comprising the steps of: 

a) analyzing the computer program instruction to determine whether the instruction will read data from or 
write data to the memory and what address it will use; 

b) evaluating the table to determine if any access to the memory is allowed at the determined address; 

c) evaluating the permission read bits, if the instruction performs a memory read operation, to determine 
if a memory read operation is allowed at the determined address; 

d) evaluating the permission write bits, if the instruction performs a memory write operation, to determine 
if a memory operation is allowed at the determined address; 

e) selectively inhibiting the execution of the instruction based upon the results of steps b), c) and d). 

3. The method of claim 2, wherein at least two of the steps a) - d) are performed concurrently. 

4. The method of claim 3, wherein a table is maintained for a plurality of computer program instructions 
which constitute a process, and separate tables are maintained for a plurality of processes, further 
comprising the step of selecting the table to use based on the process being executed by the SIMD. 

5. The method of claim 4, wherein a plurality of tables are stored in a first memory of the SIMD computer, 
and each of said plurality of tables corresponds to a process which is running on the SIMD computer, said 
method further comprising the step of transferring one of said plurality of tables from the first memory to a 
second memory of the SIMD computer for performing the steps b), c) and d). 

6. The method of claim 5, wherein the table includes a plurality of protective memory plane addresses, 
representing respective memory locations within a memory plane of the SIMD system, and said method 
further comprises the step of: 

d1) if the memory operation is allowed by step d), conditioning each processing element to determine 
whether the memory operation is allowed. 

7. A method for verifying correct transfer of data between a host computer and a SIMD computer through a 
buffer in which a set of m words, each word including n bit positions, enters the buffer, wherein a common 
bit position for the set of m words enters the buffer simultaneously for each of the n bit positions, and a set 
of x words, each word including y bit positions, leaves the buffer simultaneously, wherein a common bit 
position for the set of x words leaves the buffer simultaneously for each of the y bit positions, each common bit 
position for the set of m words corresponding to a word of the set of x words, each common bit position for 
the set of x words corresponding to a word of the set of m words, said method comprising the steps of: 

a) generating, sequentially, an error correction code for each common bit position for the set of m words 
to produce n error correction codes; 

b) transferring the set of m words to the buffer; 

c) transferring the set of n error correction codes to the buffer; 

d) transferring the set of x words and n error correction codes from the buffer; and 

e) evaluating said x words and n error correction codes to determine if data errors have occurred in any 
of transferring steps b), c) or d). 

8. The method of claim 7, wherein the steps b) and c) transfer the set of m words and the set of n error 
correction codes from the host to the buffer. 

9. The method of claim 7, wherein the steps b) and c) transfer the set of m words and the set of n error 
correction codes from the SIMD computer to the buffer. 

10. The method of claim 7, wherein each of the n error correction codes includes a plurality of check bits 
which are generated according to a modified Hamming code algorithm, wherein said Hamming code 
algorithm generates a check bit by repeatedly applying an EXCLUSIVE-OR function to a plurality of 
predetermined bit positions of a word of the set of m words and an individual bit position is used not more 
than once in the generation of said check bit. 

11. The method of claim 10, wherein a rule plane is used to direct selected bit positions of a word of the set 
of m words to the EXCLUSIVE-OR function. 

12. The method of claim 7, wherein the set of x words and the set of n error correction codes are 
transferred to the host. 

13. The method of claim 7, wherein the set of x words and the set of n error correction codes are 
transferred to the SIMD computer. 

14. The method of claim 7, wherein said n error correction codes are evaluated in conjunction with said set 
of x words, wherein step e) includes the step of generating a syndrome bit from a modified Hamming code 
generation algorithm by sequentially applying an EXCLUSIVE-OR function with the output signal of the 
EXCLUSIVE-OR function as one input signal to the EXCLUSIVE-OR function and a selected error correction 
bit and selected bit positions of the set of x words as a second input signal to the EXCLUSIVE-OR function, 
wherein an individual bit position of the set of x words is used not more than once in the generation of said 
syndrome bit. 

15. The method of claim 14, wherein the EXCLUSIVE-OR function is repeatedly applied by using an 
EXCLUSIVE-OR circuit and a latch circuit, the output terminal of the EXCLUSIVE-OR circuit is coupled to 
provide an output value to an input terminal of the latch, an output terminal of the latch is coupled to a first 
input terminal of the EXCLUSIVE-OR circuit and step f) includes the steps of: 

f1) clearing the latch; 

f2) placing a logic level corresponding to one of said predetermined bit positions of a word of the set of x 
words on a second input terminal of the EXCLUSIVE-OR circuit; 

f3) storing the logic level which appears on the output terminal of the EXCLUSIVE-OR circuit into the 
latch; 

f4) repeating the steps f2) and f3) for each predetermined bit position of the word of the set of x words 
which are used in generating the syndrome bit; 

f5) placing a logic level corresponding to a selected bit of the error correction code on a second input 
terminal of the EXCLUSIVE-OR circuit; 

f6) storing the logic level which appears on the output terminal of the EXCLUSIVE-OR circuit in the latch. 

16. The method of claim 15, wherein steps f2) and f5) include the step of using a rule plane to direct the 
selected bit positions of the word of the set of x words and the selected bit of the error correction code to 
the EXCLUSIVE-OR circuit. 

17. The method of claim 14, wherein the EXCLUSIVE-OR function is repeatedly applied by using an 
EXCLUSIVE-OR circuit and a latch circuit, the output terminal of the EXCLUSIVE-OR circuit is coupled to 
provide an output value to an input terminal of the latch, an output terminal of the latch is coupled to a first 
input terminal of the EXCLUSIVE-OR circuit, and step a) comprises the steps of: 

a1) clearing the latch; 

a2) placing a logic level corresponding to a selected bit position of a word of the set of m words on a 
second input terminal of the EXCLUSIVE-OR circuit; 

a3) storing the logic level which appears on the output terminal of the EXCLUSIVE-OR circuit in the 
latch; 

a4) repeating the steps a2) and a3) for each selected bit position of the word of the set of m words. 

18. The method of claim 15, wherein a rule plane is used to direct selected bit positions of a word of the set 
of x words to the EXCLUSIVE-OR circuit. 

19. The method of claim 17, wherein a rule plane is used to direct a check bit and selected bit positions of a 
word of the set of k words to the EXCLUSIVE-OR circuit. 

20. A method of transferring data between a host computer and an SIMD computer, said method 
comprising the steps of: 

a) serially encoding data; 

b) transmitting the serially encoded data to the SIMD computer; 

c) receiving the serially encoded data from the host computer; and 

d) decoding the serially encoded data in parallel. 

21. A method of transferring data between an SIMD computer and a host computer, said method 
comprising the steps of: 

a) encoding the data in parallel; 

b) transmitting the data that has been encoded in parallel to the host computer; 

c) receiving the data that has been encoded in parallel from the SIMD computer; and 

d) serially decoding the data that has been encoded in parallel. 